Sparsity-Based Estimation of a Panel Quantile Count Data Model with Applications to Big Data

Authors

  • Matthew Harding
  • Carlos Lamarche
Abstract

In this paper we introduce a panel quantile estimator for count data with individual heterogeneity, by constructing continuous variables whose conditional quantiles have a one-to-one relationship with the conditional count response variable. The new method is needed as a result of the increased availability of Big Data, which allows us to track event counts at the individual level for a large number of activities from webclicks and retweets to store visits and purchases. At the same time, the presence of many different subpopulations in a large dataset requires us to pay close attention to individual heterogeneity. In this paper, we propose a penalized quantile regression estimator with fixed effects and investigate the conditions under which the slope parameter estimator is asymptotically Gaussian. We investigate solutions to the computational challenges resulting from the need to estimate tens of thousands of parameters in a Big Data setting and caution against penalizing in models with substantial zero inflation and endogenous covariates by using a series of Monte Carlo simulations. We present an empirical application to individual trip counts to the store based on a large panel of food purchase transactions. JEL: C21, C23, C25, C55.
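The abstract does not say how the continuous variables are constructed. One common device for count quantiles is jittering, which adds independent Uniform[0, 1) noise to each count; the sketch below assumes that construction and is not taken from the paper. The function names (`jitter`, `order_stat_quantile`, `count_quantile_from_jittered`) and the simulated Poisson data are hypothetical and serve only to illustrate the one-to-one link between quantiles of the smoothed variable and quantiles of the count. It covers the unconditional case only, with no covariates, fixed effects, or penalty.

```python
import numpy as np


def order_stat_quantile(values, tau):
    """Empirical tau-quantile as an order statistic: inf{v : F_n(v) >= tau}."""
    sorted_values = np.sort(np.asarray(values, dtype=float))
    k = int(np.ceil(tau * sorted_values.size)) - 1
    return sorted_values[max(k, 0)]


def jitter(counts, rng):
    """Assumed construction: Z = Y + U with U ~ Uniform[0, 1), independent of Y."""
    return counts + rng.uniform(0.0, 1.0, size=counts.shape)


def count_quantile_from_jittered(z, tau):
    """Map a quantile of the jittered variable back to a count quantile.

    Because 0 <= U < 1, sorting Z preserves the ordering of Y, so the floor
    of the k-th order statistic of Z is the k-th order statistic of Y.
    """
    return int(np.floor(order_stat_quantile(z, tau)))


# Illustrative data only: simulated Poisson counts, not the paper's panel.
rng = np.random.default_rng(0)
y = rng.poisson(2.0, size=500)
z = jitter(y, rng)
for tau in (0.25, 0.5, 0.75):
    print(tau, order_stat_quantile(y, tau), count_quantile_from_jittered(z, tau))
```

The point of the smoothing is that the jittered variable has a continuous distribution, so standard quantile regression machinery applies to it, while the count quantile can still be read off afterwards.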

Similar resources

Bayesian Quantile Regression with Adaptive Lasso Penalty for Dynamic Panel Data

Dynamic panel data models are an important part of medical, social, and economic studies. The presence of the lagged dependent variable as an explanatory variable is a defining trait of these models. The estimation problem in these models arises from the correlation between the lagged dependent variable and the current disturbance. Recently, quantile regression to analyze dynamic pa...

A NOVEL FUZZY-BASED SIMILARITY MEASURE FOR COLLABORATIVE FILTERING TO ALLEVIATE THE SPARSITY PROBLEM

Memory-based collaborative filtering is the most popular approach to build recommender systems. Despite its success in many applications, it still suffers from several major limitations, including data sparsity. Sparse data affect the quality of the user similarity measurement and consequently the quality of the recommender system. In this paper, we propose a novel user similarity measure based...

The examination of relationship between socioeconomic factors and number of tuberculosis using quantile regression model for count data in Iran 2010-2011

Background: Poverty and low socioeconomic status are the most important reasons of increasing the global burden of tuberculosis, not only in developing countries but also in developed countries for particular groups. The purpose of this study was to assess the association between socioeconomic factors and the number of tuberculosis patients using quantile regression for count data.   Me...

Estimation of Count Data using Bivariate Negative Binomial Regression Models

Negative binomial regression model (NBR) is a popular approach for modeling overdispersed count data with covariates. Several parameterizations have been performed for NBR, and the two well-known models, negative binomial-1 regression model (NBR-1) and negative binomial-2 regression model (NBR-2), have been applied. Another parameterization of NBR is negative binomial-P regression mode...

Bayesian Quantile Regression with Adaptive Elastic Net Penalty for Longitudinal Data

Longitudinal studies are an important part of epidemiological surveys, clinical trials, and social studies. In longitudinal studies, responses are measured repeatedly over time. Often, the main goal is to characterize the change in responses over time and the factors that influence the change. Recently, to analyze this kind of data, quantile regression has been taken ...

Publication date: 2014